Datadriven Analys och Uppföljning av KTHs Forskning (DAUF; data-driven analysis and follow-up of KTH's research)
2025-12-05
This year's version of ABM was released about a week ago
A beta version of the topic-based KTH Research Information was recently released
Proof of concept (POC) for the KTH Indicators dashboard, based on consolidated indicators collected from across KTH
Tests and preparation for GDP 2.0 (Gemensamma dataprojektet), the new standard for Swedish project data
Work on using OpenAlex to update DiVA and to construct a bibliometric database
Changes in ABM 2025
OpenAlex
About 200 million articles, 24 million book chapters and 10 million proceedings
+--------------------------------+
|                                |
|          Data Sources          |
|                                |
+--------------------------------+
                |  Clean / Crosscheck / Transform
                v
+--------------------------------+
|                                |
|          Curated Data          |
|                                |
+--------------------------------+
                |  Write / POST
                v
+--------------------------------+
|                                |
|    [S3] Bronze/Silver/Gold     |
|                                |
+--------------------------------+
                |  Read / GET
                v
+--------------------------------+
|                                |
|     Data Consumer / Client     |
|                                |
+--------------------------------+
The DAUF project now harvests KTH's publication data from DiVA using the OAI-PMH protocol. The harvest regularly updates DuckDB databases, which are openly available from object storage.
This is work in progress (WIP), jocularly codenamed “KaTHarsis”.
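As a rough illustration of the harvesting step: OAI-PMH pages through a repository with `ListRecords` requests, and each response may carry a `resumptionToken` for the next page. The sketch below is not the DAUF harvester. The base URL, the `oai_dc` metadata prefix and the sample response are placeholders chosen for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical endpoint: replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", token=None):
    """Build a ListRecords URL; a resumption token replaces the other arguments."""
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response page, only to show the shape of the protocol.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:rec-1</identifier></header></record>
    <record><header><identifier>oai:example.org:rec-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE_PAGE)
print(ids, token)
print(list_records_url(BASE_URL, token=token))
```

A real harvester would fetch each URL, store the records, and repeat until a page arrives without a resumption token.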
y art_n_pi pi art_n_r r shr pct
2015 932 ░░░░ 879 ░░░░ ████████░░░░░░░ 51 %
2016 1142 ░░░░░ 1260 ░░░░░░ ███████░░░░░░░░ 48 %
2017 1922 ░░░░░░░░░ 855 ░░░░ ██████████░░░░░ 69 %
2018 2570 ░░░░░░░░░░░░ 683 ░░░ ████████████░░░ 79 %
2019 3329 ░░░░░░░░░░░░░░░ 1218 ░░░░░░ ███████████░░░░ 73 %
2020 3240 ░░░░░░░░░░░░░░░ 880 ░░░░ ████████████░░░ 79 %
2021 2715 ░░░░░░░░░░░░░ 1167 ░░░░░ ███████████░░░░ 70 %
2022 2767 ░░░░░░░░░░░░░ 898 ░░░░ ███████████░░░░ 75 %
2023 3141 ░░░░░░░░░░░░░░░ 734 ░░░ ████████████░░░ 81 %
2024 2530 ░░░░░░░░░░░░ 668 ░░░ ████████████░░░ 79 %
2025 2670 ░░░░░░░░░░░░░ 516 ░░ █████████████░░ 84 %
y con_n_pi pi con_n_r r shr pct
2015 454 ░░░░░ 923 ░░░░░░░░░ █████░░░░░░░░░░ 33 %
2016 675 ░░░░░░░ 743 ░░░░░░░ ███████░░░░░░░░ 48 %
2017 757 ░░░░░░░░ 828 ░░░░░░░░ ███████░░░░░░░░ 48 %
2018 844 ░░░░░░░░ 596 ░░░░░░ █████████░░░░░░ 59 %
2019 993 ░░░░░░░░░░ 813 ░░░░░░░░ ████████░░░░░░░ 55 %
2020 804 ░░░░░░░░ 659 ░░░░░░░ ████████░░░░░░░ 55 %
2021 816 ░░░░░░░░ 670 ░░░░░░░ ████████░░░░░░░ 55 %
2022 919 ░░░░░░░░░ 486 ░░░░░ ██████████░░░░░ 65 %
2023 1231 ░░░░░░░░░░░░ 548 ░░░░░ ██████████░░░░░ 69 %
2024 1182 ░░░░░░░░░░░░ 299 ░░░ ████████████░░░ 80 %
2025 799 ░░░░░░░░ 415 ░░░░ ██████████░░░░░ 66 %
t art_n_pi pi art_n_r r shr pct
2025-01 223 ░░░░ 59 ░ ████████████░░░ 79 %
2025-02 197 ░░░░ 39 ░ ████████████░░░ 83 %
2025-03 192 ░░░░ 30 ░ █████████████░░ 86 %
2025-04 241 ░░░░░ 28 ░ ██████████████░ 90 %
2025-05 151 ░░░ 57 ░ ███████████░░░░ 73 %
2025-06 187 ░░░░ 114 ░░ █████████░░░░░░ 62 %
2025-07 756 ░░░░░░░░░░░░░░ 44 ░ ██████████████░ 95 %
2025-08 203 ░░░░ 54 ░ ████████████░░░ 79 %
2025-09 236 ░░░░ 46 ░ █████████████░░ 84 %
2025-10 163 ░░░ 38 ░ ████████████░░░ 81 %
2025-11 121 ░░ 7 ██████████████░ 95 %
t con_n_pi pi con_n_r r shr pct
2025-01 148 ░░░░░░░░░░░ 38 ░░░ ████████████░░░ 80 %
2025-02 81 ░░░░░░ 14 ░ █████████████░░ 85 %
2025-03 92 ░░░░░░░ 35 ░░░ ███████████░░░░ 72 %
2025-04 90 ░░░░░░░ 30 ░░ ███████████░░░░ 75 %
2025-05 48 ░░░░ 21 ░░ ███████████░░░░ 70 %
2025-06 22 ░░ 52 ░░░░ █████░░░░░░░░░░ 30 %
2025-07 112 ░░░░░░░░ 85 ░░░░░░ █████████░░░░░░ 57 %
2025-08 38 ░░░ 59 ░░░░ ██████░░░░░░░░░ 39 %
2025-09 71 ░░░░░ 48 ░░░░ █████████░░░░░░ 60 %
2025-10 68 ░░░░░ 17 ░ ████████████░░░ 80 %
2025-11 29 ░░ 16 ░ ██████████░░░░░ 64 %
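The tables above do not define their columns, so the following is an inference, not the project's definition: the printed `pct` values are consistent with the share `n_pi / (n_pi + n_r)`, rounded to a whole percent. A minimal check against three rows copied from the tables (the function name is hypothetical):

```python
def share_pct(n_pi, n_r):
    """Share of n_pi in n_pi + n_r as a whole percent (halves round up)."""
    total = n_pi + n_r
    if total == 0:
        return 0
    return int(100 * n_pi / total + 0.5)


# (row label, n_pi, n_r, pct as printed in the tables above)
rows = [
    ("art 2015", 932, 879, 51),
    ("con 2015", 454, 923, 33),
    ("art 2025-07", 756, 44, 95),
]

for label, n_pi, n_r, printed in rows:
    print(label, share_pct(n_pi, n_r), printed)
```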
SQL workbench “Pond Pilot” (wasm) for combining DiVA, OpenAlex and other data sources
Preliminary/WIP: matching DiVA PIDs against OpenAlex IDs using DOIs/PMIDs; coverage by DiVA publication type:
n_yes n_no type_diva coverage
65712 8544 Article in journal ██████████████████░░ 88 %
19163 18698 Conference paper ██████████░░░░░░░░░░ 51 %
2460 2881 Chapter in book █████████░░░░░░░░░░░ 46 %
1383 86 Article, review/survey ███████████████████░ 94 %
172 596 Book ████░░░░░░░░░░░░░░░░ 22 %
157 4822 Manuscript (preprint) █░░░░░░░░░░░░░░░░░░░ 3 %
152 313 Article, book review ███████░░░░░░░░░░░░░ 33 %
89 241 Collection (editor) █████░░░░░░░░░░░░░░░ 27 %
54 4109 Report ░░░░░░░░░░░░░░░░░░░░ 1 %
52 216 Conference proceedings (editor) ████░░░░░░░░░░░░░░░░ 19 %
20 699 Other █░░░░░░░░░░░░░░░░░░░ 3 %
8 33 Data set ████░░░░░░░░░░░░░░░░ 20 %
5 1281 Doctoral thesis, monograph ░░░░░░░░░░░░░░░░░░░░ 0 %
4 5547 Doctoral thesis, comprehensive summary ░░░░░░░░░░░░░░░░░░░░ 0 %
1 54 Artistic output ░░░░░░░░░░░░░░░░░░░░ 2 %
1 2369 Licentiate thesis, comprehensive summary ░░░░░░░░░░░░░░░░░░░░ 0 %
1 42086 Student thesis ░░░░░░░░░░░░░░░░░░░░ 0 %
0 960 Licentiate thesis, monograph ░░░░░░░░░░░░░░░░░░░░ 0 %
0 290 Manuscript ░░░░░░░░░░░░░░░░░░░░ 0 %
0 621 Patent ░░░░░░░░░░░░░░░░░░░░ 0 %
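The matching behind a table like this can be sketched as a join on normalized identifiers. The example below is a simplified assumption, not the project's matching code: it uses DOIs only (no PMIDs), and the record layout, field names and sample values are hypothetical. Coverage per type is then `n_yes / (n_yes + n_no)`; for instance, 65712 / (65712 + 8544) is about 88 % for journal articles.

```python
from collections import defaultdict

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi):
    """Lower-case a DOI and strip common URL/prefix forms; None if empty."""
    if not doi:
        return None
    d = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if d.startswith(prefix):
            d = d[len(prefix):]
            break
    return d or None


def coverage_by_type(diva_records, openalex_dois):
    """Count matched (n_yes) and unmatched (n_no) DiVA records per type."""
    known = {normalize_doi(d) for d in openalex_dois} - {None}
    counts = defaultdict(lambda: {"n_yes": 0, "n_no": 0})
    for rec in diva_records:
        doi = normalize_doi(rec.get("doi"))
        key = "n_yes" if doi in known else "n_no"
        counts[rec["type"]][key] += 1
    return dict(counts)


# Made-up records, only to exercise the functions.
diva = [
    {"pid": 1, "type": "Article in journal", "doi": "10.1000/ABC"},
    {"pid": 2, "type": "Article in journal", "doi": None},
    {"pid": 3, "type": "Report", "doi": None},
]
openalex = ["https://doi.org/10.1000/abc"]

result = coverage_by_type(diva, openalex)
for type_diva, c in result.items():
    pct = round(100 * c["n_yes"] / (c["n_yes"] + c["n_no"]))
    print(type_diva, c["n_yes"], c["n_no"], pct)
```

Records without any identifier can never match under this scheme, which is one plausible reason for the low coverage of theses and reports in the table.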
GDP (Gemensamma data för projekt) is an effort by a number of Swedish research funders to create a common data model for project data. The five funding agencies Energimyndigheten, Formas, Forte, Vetenskapsrådet and Vinnova are developing a standard that enables sharing of open data about funding and related information.
The standard is developed in cooperation with a reference group that includes universities and other organisations within the university sector. KTH participates in the reference group.
Version 2 of the GDP API has recently been released. Tooling has been developed at KTH for harvesting data from the different sources using this version of the API; see https://github.com/KTH-Library/gdp.
Regularly harvested data from the API is now available from the “minio” object storage at KTH: https://data.bibliometrics.lib.kth.se/projects/gdp/gdp.db
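One possible way to explore the published file directly from a client is sketched below. It assumes that `gdp.db` is a DuckDB database (the text above does not state the file format) and that the third-party `duckdb` Python package and network access are available. The helper names are hypothetical; no table names are assumed, so the sketch only lists what the file contains.

```python
GDP_DB_URL = "https://data.bibliometrics.lib.kth.se/projects/gdp/gdp.db"


def attach_statement(url, alias="gdp"):
    """Build a read-only ATTACH statement for a remote DuckDB file."""
    if not alias.isidentifier():
        raise ValueError(f"unsafe alias: {alias!r}")
    escaped = url.replace("'", "''")  # escape single quotes for the SQL literal
    return f"ATTACH '{escaped}' AS {alias} (READ_ONLY)"


def list_remote_tables(url=GDP_DB_URL):
    """Attach the remote file and list its tables (needs network and duckdb)."""
    import duckdb  # third-party; imported here so attach_statement works without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    con.execute(attach_statement(url))
    return con.execute("SHOW ALL TABLES").fetchall()


print(attach_statement(GDP_DB_URL))
```

Calling `list_remote_tables()` performs the actual remote read. If the file turns out to be in another format, the ATTACH step would need to change accordingly.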
Related activities
Please provide your input in chat or verbally.
If you prefer to give your feedback later or come up with questions after this demo, you are always welcome to email us at biblioteket@kth.se.
DAUF - Demo 12 - 2025-12-05